Exploiting logical document structure for anaphora resolution
Abstract
The aim of the paper is twofold. First, an approach is presented for selecting the correct antecedent for an anaphoric element according to the kind of text segments in which both occur. Information on logical text structure (e.g. chapters, sections, paragraphs) is used to determine the antecedent life span of a linguistic expression, i.e. some linguistic expressions are more likely than others to be chosen as an antecedent throughout the whole text. In addition, an appropriate search scope for an anaphor expressed by a given linguistic expression can be defined according to the document structuring elements that contain that expression. Corpus investigations suggest that logical text structure influences the search scope of candidates for antecedents. Second, a solution is presented for integrating the resources used for anaphora resolution: multi-layered XML annotation makes a set of resources accessible to the anaphora resolution system.
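The idea of a structure-dependent search scope can be illustrated with a minimal sketch. It is not the system described in the paper: the element names (doc, section, para, np), the anaphor attribute, and the rule "search only inside the nearest enclosing element of a chosen type" are all assumptions made for illustration. The sketch collects the noun phrases that precede an anaphor within the chosen structural unit.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation: tag and attribute names are illustrative assumptions,
# not the annotation scheme used in the paper.
DOC = """
<doc>
  <section id="s1">
    <para id="p1"><np id="n1">the parser</np> reads <np id="n2">a file</np>.</para>
    <para id="p2"><np id="n3">the output</np> is stored. <np id="n4" anaphor="yes">It</np> is small.</para>
  </section>
  <section id="s2">
    <para id="p3"><np id="n5" anaphor="yes">This tool</np> is fast.</para>
  </section>
</doc>
"""


def candidates(root, anaphor_id, scope_tag):
    """Return ids of <np> elements that precede the anaphor inside the
    nearest enclosing element whose tag is scope_tag."""
    # ElementTree has no parent pointers, so build a child -> parent map.
    parent = {child: p for p in root.iter() for child in p}
    anaphor = next(e for e in root.iter("np") if e.get("id") == anaphor_id)

    # Walk upwards until the structural element defining the scope is found.
    node = anaphor
    while node is not None and node.tag != scope_tag:
        node = parent.get(node)
    if node is None:
        node = root  # assumed fallback: search the whole document

    # iter() yields elements in document order; stop at the anaphor itself.
    found = []
    for np in node.iter("np"):
        if np is anaphor:
            break
        found.append(np.get("id"))
    return found


root = ET.fromstring(DOC)
print(candidates(root, "n4", "para"))     # paragraph-level scope
print(candidates(root, "n4", "section"))  # section-level scope
print(candidates(root, "n5", "doc"))      # whole-document scope
```

Widening the scope from paragraph to section to document enlarges the candidate set, which is the kind of effect the abstract attributes to logical text structure. How the scope should actually be chosen per expression type is left open here.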
Similar resources
Exploiting Syntactic Patterns as Clues in Zero-Anaphora Resolution
We approach the zero-anaphora resolution problem by decomposing it into intra-sentential and inter-sentential zero-anaphora resolution. For the former problem, syntactic patterns of the appearance of zero-pronouns and their antecedents are useful clues. Taking Japanese as a target language, we empirically demonstrate that incorporating rich syntactic pattern features in a state-of-the-art learni...
Anaphora Resolution in Slot Grammar
We present three algorithms for resolving anaphora in Slot Grammar: (1) an algorithm for interpreting elliptical VPs in antecedent-contained deletion structures, subdeletion constructions, and intersentential cases; (2) a syntactic filter on pronominal coreference; and (3) an algorithm for identifying the binder of an anaphor (reflexive pronoun or the reciprocal phrase "each other"). These algo...
Anaphora Resolution: A Multi-Strategy Approach
Anaphora resolution has proven to be a very difficult problem; it requires the integrated application of syntactic, semantic, and pragmatic knowledge. This paper examines the hypothesis that instead of attempting to construct a monolithic method for resolving anaphora, the combination of multiple strategies, each exploiting a different knowledge source, proves more effective -theoretically and ...
Persian Printed Document Analysis and Page Segmentation
This paper presents a hybrid method, combining low-resolution and high-resolution analysis, for Persian page segmentation. In the low-resolution page segmentation, a pyramidal image structure is constructed for multiscale analysis and segments the document image into a set of regions. In the high-resolution page segmentation, connected components analysis segments each region into homogeneous regions and identifyi...
Anaphora Resolution for Biomedical Literature by Exploiting Multiple Resources
In this paper, a resolution system is presented to tackle nominal and pronominal anaphora in biomedical literature by using a rich set of syntactic and semantic features. Unlike previous research, the verification of semantic association between anaphors and their antecedents is facilitated by exploiting more external resources, including UMLS, WordNet, GENIA Corpus 3.02p and PubMed. Moreover, the...
Publication date: 2006